Abstract

A new algorithm for minimizing the quantum cost of classical reversible and quantum circuits has been designed.
The quantum costs obtained using the proposed algorithm are compared with the existing results, and it is found that
the algorithm produces the minimum quantum cost in all the cases considered.

1 Introduction

According to Landauer's principle [1], any logically irreversible operation on information is always associated with a
loss of energy: each bit of lost information leads to the release of at least $kT \ln 2$ of heat. This type of energy
loss is expected to become a substantial part of the energy dissipation in VLSI circuits in the near future. The energy
dissipation problem of VLSI circuits can be circumvented by using reversible logic, because reversible computation does
not require the erasure of any bit of information. This observation has motivated scientists to design reversible
circuits for various purposes [2]-[28]. If a reversible circuit implements quantum computation and comprises quantum
gates, then it is a quantum circuit; if it implements only classical computation (Boolean logic), then it is a classical
reversible circuit. Quantum computing has opened up several possibilities that are impossible in the classical domain. To
be precise, quantum teleportation [14], unconditionally secure cryptography [15] and superdense coding [16] do not have
any classical analogue. All these unique features of quantum communication are associated with circuits which are
reversible in nature. In other words, we require quantum circuits to implement quantum algorithms and protocols. For
example, circuits have been proposed for the implementation of Shor's algorithm [17], quantum teleportation [14], various
attacks on quantum key distribution protocols [15], [18], superdense coding [16], quantum error correction [19], [20],
fault-tolerant quantum computation [21]-[23], Grover's algorithm, nondestructive discrimination of Bell states [57],
quantum circuits for addition [28], etc. Here we would like to note that all quantum mechanical operations are reversible
and the only difference between a classical reversible gate and a quantum gate is that the classical reversible gate
cannot handle superpositions of states (qubits). Consequently, the set of all classical reversible gates forms a subset of
the set of all quantum gates. For example, the Cnot gate can be realized in both the classical and the quantum domain, but
the Hadamard gate can be realized in the quantum domain only. Therefore classical reversible circuits are only a subset of
quantum circuits, and any protocol designed for the optimization of a particular parameter of quantum circuits will also
be valid for classical reversible circuits.
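The subset relation can be made concrete with a small check. The following is a minimal Python sketch, not part of the original work: the helper name `is_classical_reversible` and the matrix layout (basis states ordered 00, 01, 10, 11) are assumptions made for illustration. A gate is treated as classical reversible when its matrix is a permutation matrix, i.e. when it maps every bit string to exactly one bit string.

```python
import math

# Gate matrices in the computational basis (Cnot: basis order 00, 01, 10, 11).
CNOT = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
H = 1 / math.sqrt(2)
HADAMARD = [[H, H], [H, -H]]


def is_classical_reversible(matrix, tol=1e-12):
    """Return True if `matrix` is a permutation matrix (one 1 per row and column)."""
    size = len(matrix)
    for i in range(size):
        row = [matrix[i][j] for j in range(size)]
        col = [matrix[j][i] for j in range(size)]
        for line in (row, col):
            ones = sum(1 for x in line if abs(x - 1) < tol)
            zeros = sum(1 for x in line if abs(x) < tol)
            if ones != 1 or zeros != size - 1:
                return False
    return True


print(is_classical_reversible(CNOT))      # True
print(is_classical_reversible(HADAMARD))  # False
```

The Hadamard matrix fails the test because it sends a basis state to an equal superposition of two basis states, which has no classical counterpart.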

We have tried to explain the need for, and the beauty of, quantum circuits. Now the question arises: how can these
circuits be obtained? There exist several algorithms for the synthesis of classical reversible circuits ^ i3, i5, and
quantum circuits [29]-[31]. But these algorithms do not provide a unique output. For example, a reversible multiplier can
be realized in many ways [5]-[13]. Therefore a quantitative measure of the quality of a circuit is required. Some of the
important quantitative measures are gate count, number of garbage bits and quantum cost. Gate count is the total number of
gates in a circuit, but this measure has a specific problem: it is not unique. If one is allowed to introduce a new gate
or a more complex gate library, then the gate count can be reduced considerably. An n-qubit reversible gate is represented
by a $2^n \times 2^n$ unitary matrix, and the product of any number of unitary matrices is always unitary. Moreover, a
serial connection of such gates corresponds to multiplication of their matrices and a parallel connection corresponds to
the tensor product of their matrices. Therefore, if we put a set of reversible quantum gates in a black box, the box can
be regarded as a new gate, and the gate count can thus be reduced to 1. For example, in [32] the circuit cost of a full
adder built from the NCT¹ gate library is 4, in [12] it is reduced to 2 by using the Peres gate, and in [11] it is reduced
to 1 by using a new gate. All these differences in the circuit cost of the full adder arise from the choice of different
gate libraries. Consequently it is important to define a unique gate library for the comparison of circuit cost. Further,
a good quantum circuit requires a minimum number of garbage bits, where a garbage bit is defined as an additional output
bit which is required to make a function reversible and which is not used for further computation. The quantum cost²
[H [33., .34] of a reversible circuit is the number of primitive quantum gates needed to implement the circuit. Primitive
quantum gates are the elementary building blocks |31]-[3^, such as the Not gate, the Cnot gate, controlled-$V$,
controlled-$V^{\dagger}$, rotation gates, etc. We can construct the Toffoli gate from the square root of Not gate (V) and
the Cnot gate; in that construction the minimum gate count of the Toffoli gate is 5 [SS] and its quantum cost is also 5.
These requirements yield separate measures of the quality of a quantum circuit. To be precise, a circuit is better if it
has fewer garbage bits, a lower circuit cost and a lower quantum cost. But it is often observed that a reduction of
circuit cost leads to an increase in garbage bits, and a reduction of quantum cost leads to an increase in circuit cost
[12]. Keeping these in mind, we have recently introduced a new parameter called total cost (TC) [13], which is the sum of
the gate count of an optimized circuit, the number of garbage bits and the quantum cost. To reduce TC it is necessary to
reduce the circuit cost, the garbage count and the quantum cost simultaneously. This is an open problem: at present there
exists neither an algorithm for the simultaneous reduction of all these measures nor a satisfactory algorithm for the
reduction of quantum cost. Before we address the more complex problem of minimizing TC, we have to devise a protocol for
the reduction of quantum cost. In some works [H [51 [TTJ [3S] the quantum cost is calculated directly by adding the
quantum costs of the respective reversible gates in the circuit, or it is optimized by applying the deletion rule only. A
simple systematic approach has also been proposed by Maslov et al. [36], [37]. These facts have motivated us to design an
algorithm for the minimization of quantum cost and to apply it to compute the quantum cost of different circuits in
gl |32l [Ml EH] .

¹ The NCT gate library is a universal gate library [32] comprising the NOT, Cnot and Toffoli gates.
² The definition of quantum cost is discussed in detail in Section 2.
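The statement that the Toffoli gate can be built from five V-type and Cnot gates can be checked numerically. The sketch below is an illustration only: it assumes one well-known arrangement of the five gates (the text above does not list the gate order), and the helper names `controlled` and `matmul` are hypothetical. Serial connection is implemented as matrix multiplication, as described in the paragraph above.

```python
# Square root of NOT (V) and its adjoint, written out explicitly.
P, M = (1 + 1j) / 2, (1 - 1j) / 2
V = [[P, M], [M, P]]
V_DAG = [[M, P], [P, M]]
X = [[0, 1], [1, 0]]
N = 3  # qubits a, b, c are lines 0, 1, 2


def controlled(control, target, u):
    """8 x 8 matrix applying the 2 x 2 gate `u` to `target` when `control` is 1."""
    dim = 1 << N
    m = [[0j] * dim for _ in range(dim)]
    for col in range(dim):
        bits = [(col >> (N - 1 - q)) & 1 for q in range(N)]
        if bits[control] == 0:
            m[col][col] = 1
            continue
        for out in (0, 1):
            new_bits = list(bits)
            new_bits[target] = out
            row = sum(b << (N - 1 - q) for q, b in enumerate(new_bits))
            m[row][col] = u[out][bits[target]]
    return m


def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


# Gates in the order they are applied (left to right in a circuit diagram).
gates = [
    controlled(1, 2, V),      # controlled-V, b -> c
    controlled(0, 1, X),      # Cnot, a -> b
    controlled(1, 2, V_DAG),  # controlled-V^dagger, b -> c
    controlled(0, 1, X),      # Cnot, a -> b
    controlled(0, 2, V),      # controlled-V, a -> c
]

circuit = gates[0]
for g in gates[1:]:
    circuit = matmul(g, circuit)  # later gates multiply from the left

# Toffoli: identity except that |110> and |111> are exchanged.
toffoli = [[1 if i == j else 0 for j in range(8)] for i in range(8)]
toffoli[6][6] = toffoli[7][7] = 0
toffoli[6][7] = toffoli[7][6] = 1

is_toffoli = all(abs(circuit[i][j] - toffoli[i][j]) < 1e-12
                 for i in range(8) for j in range(8))
print(is_toffoli, len(gates))  # True 5
```

Each of the five gates is a 4 x 4 gate, so counting them gives both the gate count and the quantum cost of 5 quoted above.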

In the next section we have discussed the earlier approaches and their limitations. In section 3, we have proposed an 
algorithm for calculating the quantum cost of reversible circuits. Here we have also compared our results with the earlier 
proposals [U [SI [351 [3S1 [35] to establish that the quantum cost computed by the proposed algorithm is minimum. Finally 
we conclude in section 4. 

2 Previous works 

The cost of an arbitrary unitary gate was first introduced by Barenco et al. in 1995 [34]. They considered all 2 x 2
gates and the Cnot gate as basic gates and showed that, for any 2 x 2 unitary gate U, the corresponding controlled-U gate
(C-U) can be realized using at most 6 basic gates. However, to analyze the cost of a large gate (the n-bit Toffoli gate)
they took the cost of C-U to be $\Theta(1)$. The following year, Smolin and DiVincenzo [33] calculated the cost of the
Fredkin gate. In their calculation they went beyond the definition of Barenco et al. and assumed that the cost of every
4 x 4 gate is 1. This assumption does not contradict Barenco et al.'s definition of cost, since the cost of every
two-qubit quantum gate is $\Theta(1)$. Further progress in cost calculation was made by Perkowski et al. [31] in 2003, who
showed that a one-qubit gate costs nothing if it precedes or follows a two-qubit gate, because the one-qubit gate can be
combined with the two-qubit gate to yield a new two-qubit gate. Thus the cost is calculated as the total number of 4 x 4
gates used. Following this definition, the cost of the swap gate is one and that of the Peres gate is four. The Peres gate
is universal for reversible Boolean operations and it has the minimum cost among the universal gates. This observation of
Perkowski et al. motivated others to use the Peres gate to minimize the cost. For example, Maslov and Dueck [40] used the
idea of Perkowski et al. and showed that, since the number of elementary quantum operations required to implement the
Peres gate is small, it can be substituted into an n-bit Toffoli network to reduce the cost of n-bit Toffoli gates. Here
we would like to note that in the earlier works |331 13^ 135] quantum cost was referred to simply as cost. The term
quantum cost was coined by Maslov et al. |40l HTj in 2003; they defined the quantum cost of a gate G as the number of
elementary quantum operations required to realize the function given by G. Later, Hung et al. [35] reconsidered the
quantum cost estimation protocol defined by Smolin and DiVincenzo [33]. They stated that each two-qubit gate and each
symmetric gate pattern (see Fig. 2 of [42]) has quantum implementation cost 1. In essence all these definitions of quantum
cost are synonymous, and we can follow Perkowski's definition [39] and state that the quantum cost of a classical
reversible or quantum circuit is the minimum number of one-qubit and two-qubit quantum gates needed to implement the
circuit.

In the recent past, the quantum costs of different classical reversible and quantum circuits have been reported
|31 [H [S] 1111 1321 1361 [3S][13]. Simultaneously, several efforts have been made to reduce the quantum cost of different
gates and circuits. For example, Barenco et al. [34] estimated the cost of a 6-bit Toffoli gate as 61. Maslov and Dueck
[40] initially reduced the quantum cost of this gate to 48 by using the Peres gate. Maslov et al. [37] then reduced it
further to 38 by applying local optimization tools. There also exist the following two online databases: i) the benchmark
page of Maslov et al. [32] and ii) RevLib [38], which include the quantum costs of different circuits calculated by
different authors. In 2005 Maslov et al. [37] showed that a closer look at the cost metrics classifies them into two
subclasses: linear cost (where the quantum cost of a circuit is calculated as the sum of the quantum costs of its gates)
and nonlinear cost (where a local optimization algorithm is used). According to this classification scheme [37], the
quantum cost defined by Smolin and DiVincenzo [33] and by Hung et al. [42] is nonlinear. Interestingly, Haghparast and
Eshghi [4] have given the following two prescriptions for the calculation of quantum cost:

1. Implement the circuit/gate using only the quantum primitive (2 x 2) and (4 x 4) gates³ and count them.

2. Synthesize the new circuit/gate using well-known gates whose quantum costs are specified, and add up their quantum
costs to obtain the total quantum cost.

³ An n-qubit gate is represented by a $2^n \times 2^n$ unitary matrix. Therefore 2 x 2 and 4 x 4 gates correspond to
one-qubit and two-qubit gates, respectively. Different notations have been used in [4l [Sl [39l 1421 143| .

In both of these cases we obtain a linear cost metric, and consequently the quantum cost obtained by these two
procedures may be higher than the actual one unless local optimization algorithms are applied to the entire circuit. When
we apply a local optimization algorithm to the entire circuit, we obtain a nonlinear cost metric. The proposed algorithm
calculates a nonlinear cost metric. Local optimization is expected to play an important role in the minimization of
quantum cost. Maslov et al. |36l H51 HB] have realized this fact and have proposed an algorithm for the minimization of
quantum cost by applying templates, which yields a nonlinear cost metric.
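A linear cost metric amounts to a table lookup followed by a sum. The following minimal Python sketch illustrates this with the per-gate costs quoted in this paper (Toffoli 5, Peres 4, swap 1, and 1 for any single 4 x 4 gate such as Cnot). The function name and the gate labels are assumptions made for illustration, and the two gate lists correspond to the Fredkin circuits described in the caption of Fig. 2.

```python
# Per-gate quantum costs quoted in the text; a Cnot is a single 4 x 4 gate.
GATE_COST = {"CNOT": 1, "SWAP": 1, "PERES": 4, "TOFFOLI": 5}


def linear_quantum_cost(circuit):
    """Sum of the tabulated gate costs, with no local optimization applied."""
    return sum(GATE_COST[name] for name in circuit)


fredkin_three_toffoli = ["TOFFOLI", "TOFFOLI", "TOFFOLI"]  # as in Fig. 2a
fredkin_after_templates = ["CNOT", "TOFFOLI", "CNOT"]      # as in Fig. 2g

print(linear_quantum_cost(fredkin_three_toffoli))    # 15
print(linear_quantum_cost(fredkin_after_templates))  # 7
```

For the same two circuits the nonlinear values reported in Fig. 2 are 11 and 5, which is the gap that local optimization is meant to close.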

Our current work and the work of Maslov et al. are contemporaneous and independent. They differ greatly in their
premises, methods and consequences. 1) Maslov et al.'s work deals with circuit optimization, more precisely with
minimizing the gate count by local optimization tools. They have introduced templates and applied them to optimize the
gate count. In contrast, our algorithm exploits a conceptual difference between the optimization algorithm used for the
reduction of gate count and the one used for the reduction of quantum cost. 2) They are restricted to a particular gate
library, whereas to reduce the quantum cost we introduce new gates, as long as each new gate is a 2 x 2 or 4 x 4 quantum
gate. 3) In their work the local optimization tools reduce the gate count only, but in our work they are applied to
reduce the quantum cost as well. This is shown in Fig. 2c, where the moving rule [36] (which was essentially designed to
reduce circuit complexity) has not reduced the circuit complexity but has reduced the quantum cost. This is evident in the
work of Smolin [33]. Further, we would like to note that the quantum cost obtained by Maslov et al. is not linear, and
neither is ours. Consequently it is completely justified to compare the quantum cost obtained by our proposed algorithm
with that obtained using Maslov et al.'s algorithm.

Figure 1: Algorithm for minimization of quantum cost.



3 Optimization algorithm 

In this section we propose an algorithm that optimizes the quantum cost of classical reversible and quantum circuits.
It is presented in the form of a flowchart in Fig. 1. The flowchart is explained below:

1. The input is a reversible circuit. Here we would like to note that our goal is to find the minimum number of
quantum primitive gates required to implement the circuit, and in principle we are not much concerned about the choice of
gate library [32], [34], [35]. In practice, however, it is easier to work with an input circuit built from the gates of a
standard gate library for which a large or complete set of templates is known. At present a few sets of templates are
available for classical reversible circuits |551HB] . Not many templates [3B] have been reported for quantum circuits,
although it is not difficult to generate them. Therefore, at the beginning of the algorithm we convert the input
reversible circuit into a circuit composed of gates taken from a standard gate library, preferably one for which a
complete or large set of templates is already known.

2. In the next step we optimize the gate count of the reversible circuit by applying local optimization tools, namely
the moving rule, the deletion rule and template matching. We apply the moving rule or commutation rule [35^, which is
simply a matrix operation that checks whether adjacent gates commute. This operation helps to reduce the gate count with
the aid of the self-inverse rule and template matching [36]. If at any point we find that adjacent gates are of the same
type and together form an identity (I), then we can remove both gates. This is called the self-inverse or deletion rule.
In the NCT gate library all the gates are self-inverse, and in the NCV gate library all gates apart from the square root
of Not gate (for which $V V^{\dagger} = I$) are self-inverse. In template matching [5] a sequence of gates is substituted
by another sequence of gates having a lower gate count and the same operational effect. Suppose we have a template
$U_1 U_2 U_3 U_4 U_5 = I$ (where each $U_j$ is a unitary gate) and in the optimization procedure we come across the
sequence of gates $U_2 U_3 U_4$; then we can replace this sequence by $U_1^{\dagger} U_5^{\dagger}$.

Figure 2: a) A Fredkin gate is implemented using three Toffoli gates. b) Each Toffoli gate is substituted by primitives,
so its direct linear cost is 5 x 3 = 15. c) The moving rule is applied (the movements are shown by arrows). d) and e) The
template matching rule is applied. f) New gates are introduced (dashed boxes) and the quantum cost is 11. g) The circuit
shown in Fig. 2a is reduced here by template matching to one Toffoli and two Cnot gates. h) The Toffoli gate is
substituted by quantum primitives; according to Haghparast and Eshghi's methods [4] the quantum cost is now 7. The moving
rule is applied to the circuit. i) New gates are introduced to yield a quantum cost of 5 for the Fredkin gate.

Figure 3: a) Reversible circuit for the function 3_17 given in the benchmark page of Maslov et al. [32]. b) The
commutation rule is applied; the arrow shows the movement of the Cnot gate. c) NCT circuit before substitution of
primitives. d) Quantum circuit of the 3_17 function obtained by substituting the Toffoli gates with primitives. e) The
template matching tool from [36] is applied to the circuit. f) Quantum circuit with reduced gate count. g) The modified
local optimization rule is applied and two movements are made in the circuit, as indicated by the arrows. h) New gates are
introduced (each box is a new gate) and the quantum cost of the circuit is obtained as the total number of quantum gates
present in the circuit. The quantum cost of this circuit is 7.



3. In this step we obtain an equivalent primitive circuit. This is done by decomposing every n-qubit gate (where
$n \geq 3$) into an equivalent circuit comprising elementary gates (2 x 2 or 4 x 4 quantum gates).

4. We optimize the circuit comprising quantum primitive gates in the following steps.

(a) We apply the moving rule, the deletion rule and modified template matching. In modified template matching a sequence
of gates is substituted by another sequence of gates if this decreases the overall quantum cost of the circuit. In
step 2 we explained how standard template matching reduces the quantum cost by reducing the gate count.
In modified template matching the overall cost is reduced by the simultaneous application of template matching
and the introduction of new gates. Here we may even substitute a sequence of gates by a longer sequence if, after
the substitution, the gates at the edges of the new sequence merge with the adjacent gates on the same
qubit lines to reduce the overall quantum cost. This is explained in example 1 of this section.

(b) We club together adjacent gates of dimension 2 x 2 and 4 x 4, 4 x 4 and 2 x 2, 2 x 2 and 2 x 2, or
4 x 4 and 4 x 4 to form new gates. The circuit may contain other gates on the same qubit lines that are not
adjacent. In this step we apply the commutation rule, and if gates on the same qubit line or lines can be
brought adjacent to each other, they again form a new gate and reduce the cost. This is a modified optimization in
which we introduce new gates and apply the commutation rule to decrease the quantum cost of the circuit.




(c) Since new gates are formed in this procedure, the gates now present in the circuit may belong to another gate
library for which templates may exist; hence we explore the further scope for minimizing the gate count by template
matching and the deletion rule. We may need to generate new templates for this procedure.

5. We remove those gates which do not affect the output, or in other words affect only the garbage bits. When we
substitute a Toffoli gate by quantum primitives, a lot of unnecessary quantum gates appear. This situation
is similar to that of the garbage bits which are added to make an irreversible function reversible; analogously, these
gates can be called garbage gates. For example, if the desired output of the circuit is obtained from
the third qubit line in Fig. 2i, then the first two qubit lines at the output are garbage bits and the last two Cnot gates
are garbage gates. Another example is a reversible function like 4mod5 [31] (Grover's oracle), whose output is 1 if
the 4-bit input is divisible by 5. The circuit has one desired output and the rest of the output bits are garbage bits. In
this case, when we apply our quantum cost minimization algorithm, we find it helpful to remove those gates (garbage
gates) that affect only the garbage bits. This unique feature of the quantum cost optimization algorithm is applied in
the present work to minimize the cost of the 4mod5 d1 circuit* (see Table 1).

6. The quantum cost of the entire circuit is obtained as the total number of quantum gates present in the circuit.
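The moving rule and the deletion rule of step 2 can be sketched for the NCT library as follows. This is an illustrative sketch and not the authors' implementation: the gate encoding `(controls, target)` and the function names are assumptions, the commutation test is done by brute force over all inputs instead of by a matrix product, and template matching is not included.

```python
from itertools import product

# A gate is (controls, target): no control is NOT, one is Cnot, two is Toffoli.


def apply_gate(gate, bits):
    """Flip the target bit when all control bits are 1."""
    controls, target = gate
    out = list(bits)
    if all(bits[c] for c in controls):
        out[target] ^= 1
    return tuple(out)


def commute(g1, g2, n):
    """Moving rule: check that g1 and g2 give the same result in either order."""
    return all(
        apply_gate(g2, apply_gate(g1, bits)) == apply_gate(g1, apply_gate(g2, bits))
        for bits in product((0, 1), repeat=n)
    )


def find_pair(circuit, n):
    """Find two identical gates that the moving rule can bring next to each other."""
    for i, gate in enumerate(circuit):
        for j in range(i + 1, len(circuit)):
            if circuit[j] == gate and all(
                commute(gate, circuit[k], n) for k in range(i + 1, j)
            ):
                return i, j
    return None


def cancel_pairs(circuit, n):
    """Deletion rule: remove such pairs (NCT gates are self-inverse) until none is left."""
    circuit = list(circuit)
    pair = find_pair(circuit, n)
    while pair is not None:
        i, j = pair
        del circuit[j]  # delete the later gate first so that index i stays valid
        del circuit[i]
        pair = find_pair(circuit, n)
    return circuit


# Cnot(0->2), Toffoli(0,1->2), Cnot(0->2), Cnot(1->0) on three lines.
example = [((0,), 2), ((0, 1), 2), ((0,), 2), ((1,), 0)]
print(cancel_pairs(example, 3))  # [((0, 1), 2), ((1,), 0)]
```

In the example the two Cnot gates with target line 2 are separated by a Toffoli gate with the same target. They commute with it, so they can be moved together and deleted.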

To illustrate how this algorithm works, let us consider the following two examples.

1. Consider a Fredkin gate and convert it to an NCT circuit by applying a synthesis algorithm [S], as shown in Fig. 2a.
We calculate its quantum cost in two parts: without optimizing the NCT circuit and after optimizing
the NCT circuit. In the first part we substitute each Toffoli gate with its quantum primitives, as shown in Fig. 2b.
In Fig. 2c we have applied the moving rule and indicated the movement by arrows. There are two places, shown in
Fig. 2d, where modified template matching can be applied; the resultant circuit is shown in Fig. 2e, where we
have also marked the places where templates can be applied again. We obtain the circuit shown in Fig. 2f, in which the
new gates are marked by boxes, and find that the quantum cost is 11. In the second part we optimize the NCT
circuit of the Fredkin gate in Fig. 2a by applying template matching and obtain the optimized circuit shown in Fig.
2g; the Toffoli gate is then substituted by primitives, as shown in Fig. 2h. We have applied the modified optimization
rule (commutation is shown by an arrow), and in Fig. 2i we have shown the new gates formed. The quantum cost of
the circuit is 5. This example clearly establishes that it is essential to optimize the reversible circuit before
substituting its gates with quantum primitives. This aspect is not mentioned in earlier works |33l 1551 [5^ E]. It also
clearly explains the meaning of the modified template matching protocol introduced in the present work.

2. The reversible NCT circuit for the function 3_17 is shown in Fig. 3a [32]. This is the input of our algorithm. In
Fig. 3b we have shown that the Cnot gate at the end commutes with the adjacent Toffoli gate (thereby reducing the quantum
cost); the movement is shown by an arrow. The resultant circuit after commutation is shown in Fig. 3c. We try to
optimize its gate count but find that we cannot apply the self-inverse rule or template matching. We substitute the
Toffoli gates with their primitives, and the resultant circuit is shown in Fig. 3d. We then try to optimize this circuit:
there are two places, shown in Fig. 3e, where templates can be applied, and after applying the templates we obtain
the circuit shown in Fig. 3f. Thereafter we apply the modified optimization technique in Fig. 3g, and the new gates
formed are shown in boxes in Fig. 3h. Finally we count the total number of quantum gates in the circuit
and find that the quantum cost of the circuit is 7.
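The second part of example 1 can be reproduced numerically. The sketch below is an illustration under stated assumptions and not a transcription of Fig. 2: it assumes that the optimized NCT circuit is Cnot, Toffoli, Cnot, that the Toffoli gate is replaced by five V-type and Cnot primitives, and that commuting gates have already been moved so that gates on the same pair of lines are neighbours. The gate ordering and the names `apply_controlled`, `is_fredkin` and `merged_gate_count` are hypothetical. The code first checks that the seven primitives realize the Fredkin gate and then counts the gates left after adjacent gates on the same two lines are clubbed together.

```python
from itertools import product

P, M = (1 + 1j) / 2, (1 - 1j) / 2
V = [[P, M], [M, P]]        # square root of NOT
V_DAG = [[M, P], [P, M]]    # its adjoint
X = [[0, 1], [1, 0]]
A, B, C = 0, 1, 2           # control line a and the two swapped lines b, c
N = 3

# (control, target, 2 x 2 gate) in the order applied. This ordering is an
# assumption chosen so that gates on the same pair of lines sit next to each other.
CIRCUIT = [
    (C, B, X), (B, C, V),      # both act on lines b, c
    (A, C, V),
    (A, B, X),
    (B, C, V_DAG), (C, B, X),  # both act on lines b, c
    (A, B, X),
]


def apply_controlled(state, control, target, u):
    """Apply `u` to `target` on the part of the state where `control` is 1."""
    new = [0j] * len(state)
    shift = N - 1 - target
    for index, amp in enumerate(state):
        if not (index >> (N - 1 - control)) & 1:
            new[index] += amp
            continue
        t = (index >> shift) & 1
        for out in (0, 1):
            new[(index & ~(1 << shift)) | (out << shift)] += u[out][t] * amp
    return new


def is_fredkin(circuit):
    """Check on every basis input that b and c are swapped exactly when a = 1."""
    for a, b, c in product((0, 1), repeat=N):
        state = [0j] * (1 << N)
        state[a * 4 + b * 2 + c] = 1
        for control, target, u in circuit:
            state = apply_controlled(state, control, target, u)
        expected = a * 4 + (c if a else b) * 2 + (b if a else c)
        if abs(state[expected] - 1) > 1e-12:
            return False
    return True


def merged_gate_count(circuit):
    """Club each gate into the preceding one whenever the two together
    still act on at most two qubit lines, then count the resulting gates."""
    blocks = []
    for control, target, _ in circuit:
        lines = {control, target}
        if blocks and len(blocks[-1] | lines) <= 2:
            blocks[-1] |= lines
        else:
            blocks.append(lines)
    return len(blocks)


print(is_fredkin(CIRCUIT), len(CIRCUIT), merged_gate_count(CIRCUIT))  # True 7 5
```

The count of 7 before merging and 5 after merging agrees with the values quoted for Fig. 2h and Fig. 2i. The merging step here only looks at immediately adjacent gates, so it relies on the moving rule having been applied beforehand.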

3.1 Quantum cost optimized circuits 

We have already mentioned that most of the existing results related to quantum cost are available in the benchmark page
of Maslov et al. [32] and in RevLib [38]. In addition to these two databases, Mohammadi and Eshghi [4], Gupta et al. [H]
and Maslov et al. [36] have independently reported the quantum costs of different reversible circuits. We have compared
the quantum costs reported in these works with the quantum costs of the same functions obtained using the present
algorithm. The results of the comparison are shown in Tables 1-3. To be precise, in Table 1 we have compared the quantum
costs of the following functions: i) the mod5 function, which is a divisibility checker, ii) ham3, which is the size-3
Hamming optimal coding function, iii) ham7, which is the size-7 Hamming optimal coding function, iv) hwb4, which is the
hidden weighted bit function [47] with parameter N = 4, v) 3_17, which is the worst-case 3-variable function [S] with
function specification {7, 1, 4, 3, 0, 2, 6, 5} and vi) 4_49, which is the worst-case 4-variable function [5] with
function specification {15, 1, 12, 3, 5, 6, 8, 7, 0, 10, 13, 9, 2, 4, 14, 11}. Here we would like to note that in this
paper we refer to the circuits described in [32] as benchmark circuits. To provide specific examples and to establish the
superiority of our algorithm, we have applied it to those benchmark circuits. Further, the benchmark circuit reported in
[32] to realize a particular function is not unique, and consequently different designs for the same purpose are marked
with different indices; for example, d1 denotes design 1, d2 denotes design 2, etc. Here we have followed the same
convention as is used in [3H. Gupta et al. [B] have synthesized a few reversible circuits for the realization of the
above-mentioned functions in the form of networks of Toffoli gates and have also reported their quantum costs. Further, in
[36] improved quantum costs are reported for various circuits reported earlier [32]. We have compared the quantum costs
reported in these works in Table 1.

* 4mod5 d1 stands for design number 1 given in the benchmark page to realize the 4mod5 function from the NCT gate library.

Table 1: Comparison of the quantum cost obtained using our algorithm with the existing results of Maslov et al. [32],
Maslov et al. and Gupta et al. [3].

Table 2: Comparison of the quantum cost obtained using our algorithm with the existing results in RevLib [38].

Table 3: Comparison of the quantum cost obtained using our algorithm with the existing results of Mohammadi [4].

Table 4: Quantum cost of important quantum circuits.
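The function specifications quoted in this section for 3_17 and 4_49 can be checked for reversibility with a few lines of Python. This sketch assumes that a specification lists the output for inputs 0, 1, 2, ... in order, and it reads the 4_49 specification as the sixteen values 15, 1, 12, 3, 5, 6, 8, 7, 0, 10, 13, 9, 2, 4, 14, 11. The function names are hypothetical.

```python
SPEC_3_17 = [7, 1, 4, 3, 0, 2, 6, 5]
SPEC_4_49 = [15, 1, 12, 3, 5, 6, 8, 7, 0, 10, 13, 9, 2, 4, 14, 11]


def is_reversible(spec):
    """A truth table on n bits is reversible iff it is a permutation of 0 .. 2^n - 1."""
    return sorted(spec) == list(range(len(spec)))


def inverse(spec):
    """Specification of the inverse function: the input that produces each output."""
    inv = [0] * len(spec)
    for x, y in enumerate(spec):
        inv[y] = x
    return inv


print(is_reversible(SPEC_3_17), is_reversible(SPEC_4_49))  # True True
print(inverse(SPEC_3_17))  # [4, 1, 5, 3, 2, 7, 6, 0]
```

Both specifications are permutations, so no garbage bits are needed to make these two functions reversible.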

In Table 2 we have reported the quantum costs of circuits from RevLib [38]. To be precise, we have compared the quantum
costs of the following functions: i) the Miller gate, ii) 3_17, which is the worst-case 3-variable function [S], and iii)
different designs of the decode 24 function, which is a 2-to-4 binary decoder. Table 3 compares the quantum costs of some
circuits that have been reported in [4], for example: i) a two-bit binary adder with carry input using one constant input
(see Fig. 3a of [4]), ii) a two-bit binary adder with carry input using two constant inputs (see Fig. 3b of [4]), iii) a
9's complement circuit without constant inputs (see Fig. 4a of [4]) and iv) a 2 x 2 bit multiplier (see Fig. 15 of [4]).
The algorithm may be applied to other benchmark circuits too, but to do so one has either to develop templates for the
corresponding gate library or to convert the circuit into another gate library for which templates are available in the
literature, for example the NCT gate library. In Table 4 we have calculated the quantum costs of some purely quantum
circuits such as the EPR circuit, quantum teleportation and the Shor code. Since the quantum costs of these circuits have
not been reported earlier, no comparison could be made. The quantum cost optimized circuits are shown in the first column
of Tables 1-4. The gates shown in a dotted box form a new gate, which is counted as a single gate in the calculation of
quantum cost.

4 Conclusions 

We have proposed an algorithm for the minimization of quantum cost. We have applied our algorithm to different circuits
from various sources [H [6l |32j [36l [38] and compared our results. The outcome of the comparison (see Tables 1-3)
clearly shows that the proposed algorithm produces the best result in the cases considered. In Table 4 we have reported
the quantum costs of different quantum circuits (for example, quantum teleportation, the EPR circuit, etc.). These
examples clearly establish that the proposed algorithm is useful for the reduction of quantum cost. Thus the present
algorithm provides a window for the reduction of the quantum cost of other circuits in the future.